---
title: "NVIDIA NIM"
description: "Use Weave to trace and log LLM calls made via the ChatNVIDIA library"
---
Weave automatically tracks and logs LLM calls made via the ChatNVIDIA library, after `weave.init()` is called.
For the latest tutorials, visit Weights & Biases on NVIDIA.
## Tracing
It's important to store traces of LLM applications in a central database, both during development and in production. You'll use these traces for debugging and to help build a dataset of tricky examples to evaluate against while improving your application.

Weave can automatically capture traces for the ChatNVIDIA python library. Start capturing by calling `weave.init(<project-name>)` with a project name of your choice.
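For example, a minimal sketch of a traced call (the project name and model name are placeholders; you'll need an `NVIDIA_API_KEY` set in your environment):

```python
import weave
from langchain_nvidia_ai_endpoints import ChatNVIDIA

# Initialize Weave with a project name of your choice;
# all ChatNVIDIA calls after this point are traced automatically.
weave.init("nvidia-nim-demo")

llm = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")

# This call is captured as a trace in the Weave UI.
response = llm.invoke("What is NVIDIA NIM in one sentence?")
print(response.content)
```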
## Track your own ops
Wrapping a function with `@weave.op` starts capturing inputs, outputs and app logic so you can debug how data flows through your app. You can deeply nest ops and build a tree of functions that you want to track. This also starts automatically versioning code as you experiment to capture ad-hoc details that haven't been committed to git.

Simply create a function decorated with `@weave.op` that calls into the ChatNVIDIA python library.

In the example below, we have 2 functions wrapped with op. This helps us see how intermediate steps, like the retrieval step in a RAG app, are affecting how our app behaves. Navigate to Weave and you can click `get_pokemon_data` in the UI to see the inputs & outputs of that step.
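A sketch of the two-op pattern (the PokéAPI fields and model name are illustrative, and the exact code in the original tutorial may differ):

```python
import json

import requests
import weave
from langchain_nvidia_ai_endpoints import ChatNVIDIA

weave.init("pokedex-nvidia")

@weave.op
def get_pokemon_data(pokemon_name: str) -> dict:
    # Intermediate "retrieval" step: fetch structured data from the PokeAPI.
    url = f"https://pokeapi.co/api/v2/pokemon/{pokemon_name.lower()}"
    data = requests.get(url).json()
    return {
        "name": data["name"],
        "height": data["height"],
        "weight": data["weight"],
        "types": [t["type"]["name"] for t in data["types"]],
    }

@weave.op
def pokemon_chat(pokemon_name: str) -> str:
    # Nested op: the call to get_pokemon_data shows up as a child
    # of pokemon_chat in the trace tree.
    data = get_pokemon_data(pokemon_name)
    llm = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")
    response = llm.invoke(
        f"Describe this pokemon in one sentence: {json.dumps(data)}"
    )
    return response.content

print(pokemon_chat("pikachu"))
```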
## Create a Model for easier experimentation
Organizing experimentation is difficult when there are many moving pieces. By using the `Model` class, you can capture and organize the experimental details of your app, like your system prompt or the model you're using. This helps organize and compare different iterations of your app.

In addition to versioning code and capturing inputs/outputs, `Model`s capture structured parameters that control your application's behavior, making it easy to find what parameters worked best. You can also use Weave `Model`s with `serve` and `Evaluation`s.

In the example below, you can experiment with `model` and `system_message`. Every time you change one of these, you'll get a new version of `GrammarCorrectorModel`.
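A sketch of such a model, assuming the same ChatNVIDIA client as above (the model name and prompts are placeholders):

```python
import weave
from langchain_nvidia_ai_endpoints import ChatNVIDIA

weave.init("grammar-nvidia")

class GrammarCorrectorModel(weave.Model):
    # Structured parameters: changing either of these creates
    # a new version of GrammarCorrectorModel in Weave.
    model: str
    system_message: str

    @weave.op
    def predict(self, user_input: str) -> str:
        client = ChatNVIDIA(model=self.model)
        response = client.invoke(
            [
                ("system", self.system_message),
                ("user", user_input),
            ]
        )
        return response.content

corrector = GrammarCorrectorModel(
    model="mistralai/mixtral-8x7b-instruct-v0.1",
    system_message="You are a grammar checker; correct the user's input.",
)
print(corrector.predict("That was so easy, it was a piece of pie!"))
```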
## Usage Info
The ChatNVIDIA integration supports `invoke`, `stream` and their async variants. It also supports tool use.

As ChatNVIDIA is meant to be used with many types of models, it does not have function calling support.
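For instance, streaming and async calls are traced the same way as `invoke` (a sketch; the model name is a placeholder):

```python
import asyncio

import weave
from langchain_nvidia_ai_endpoints import ChatNVIDIA

weave.init("nvidia-usage-demo")

llm = ChatNVIDIA(model="meta/llama3-8b-instruct")

# Synchronous streaming: each chunk belongs to a single trace.
for chunk in llm.stream("Write a haiku about tracing."):
    print(chunk.content, end="")

# Async invocation is traced as well.
async def main():
    response = await llm.ainvoke("Say hello.")
    print(response.content)

asyncio.run(main())
```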